Author Profiling using Complementary Second Order Attributes and Stylometric Features

Authors

  • Konstantinos Bougiatiotis
  • Anastasia Krithara
Abstract

In this paper we present an approach to the task of author profiling. We propose a modular framework that extracts two main groups of features, combines them with appropriate preprocessing, and uses Support Vector Machines for classification. The two groups are stylometric and discriminative: trigrams on the one hand, and complementary-weighted Second Order Attributes on the other. We treat the task as a profile-based problem, creating target profiles and grouping each user's tweets into a single document.
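The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say whether the trigrams are character or word trigrams (character trigrams are assumed here), the complementary weighting scheme is not described and is therefore omitted, and every function name and the toy data are hypothetical. The sketch groups each user's tweets into one document, builds one target profile per class, and represents a document by its cosine similarity to each profile, which is a simplified reading of Second Order Attributes. The resulting vectors would then be passed to a classifier such as an SVM, which is not shown.

```python
from collections import Counter, defaultdict
import math


def char_trigrams(text):
    """Count overlapping character trigrams in a lowercased string."""
    text = text.lower()
    return Counter(text[i:i + 3] for i in range(len(text) - 2))


def group_tweets(tweets):
    """Concatenate each user's tweets into a single document (user -> text)."""
    parts = defaultdict(list)
    for user, tweet in tweets:
        parts[user].append(tweet)
    return {user: " ".join(chunks) for user, chunks in parts.items()}


def build_profiles(docs, labels):
    """Sum the trigram counts of all documents that share a class label."""
    profiles = defaultdict(Counter)
    for user, text in docs.items():
        profiles[labels[user]].update(char_trigrams(text))
    return dict(profiles)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def second_order_vector(text, profiles):
    """Represent a document by its similarity to each target profile.

    Simplified assumption: plain cosine similarity, without the
    complementary weighting mentioned in the abstract.
    """
    grams = char_trigrams(text)
    return {label: cosine(grams, profiles[label]) for label in sorted(profiles)}


# Hypothetical toy data: (user, tweet) pairs and one class label per user.
tweets = [
    ("u1", "loving the new phone"),
    ("u1", "phone battery is great"),
    ("u2", "match tonight was wild"),
    ("u2", "what a goal tonight"),
]
labels = {"u1": "A", "u2": "B"}

docs = group_tweets(tweets)
profiles = build_profiles(docs, labels)
vec = second_order_vector(docs["u1"], profiles)
```

In this sketch each document is reduced to one number per target class, so the feature vector stays small regardless of vocabulary size; the raw trigram counts could be kept alongside it as the stylometric feature group.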


Similar resources

An Author Profiling Approach Based on Language-dependent Content and Stylometric Features

We describe the approach that we submitted to the 2015 PAN competition [5] for the author profiling task. The task consists of predicting some attributes of an author by analyzing a set of his/her tweets. We consider several sets of stylometric and content features, and different decision algorithms: we use a different combination of features and decision algorithm for each language-attrib...


Author Profiling using Stylometric and Structural Feature Groupings

In this paper we present an approach to the task of author profiling. We propose a coherent grouping of features combined with appropriate preprocessing steps for each group. The groups we used were stylometric and structural, featuring, among others, trigrams and counts of Twitter-specific characteristics. We address gender and age prediction as a classification task and personality prediction...


Grammar Checker Features for Author Identification and Author Profiling Notebook for PAN at CLEF 2013

Our work on author identification and author profiling is based on the question: Can the number and the types of grammatical errors serve as indicators for a specific author or a group of people? To detect the grammatical errors, we base our approach on the output of the open-source library LanguageTool. In the case of author identification we transform the problem into a statistica...


Using Textual Transcripts of Parliamentary Interventions for Profiling Portuguese Politicians

This paper presents an experimental study on the subject of profiling political actors through textual transcriptions of their parliamentary interventions. Supervised learning techniques were used to learn models, which attempt to classify Portuguese politicians according to their gender, their age group, or their political affiliation and orientation. Experiments were made using different type...


Exploring Performance-Based Music Attributes for Stylometric Analysis

Music Information Retrieval (MIR) and modern data mining techniques are applied to identify style markers in MIDI music for stylometric analysis and author attribution. Over 100 attributes are extracted from a library of 2830 songs and then mined using supervised learning techniques. Two attributes are identified that provide high information gain. These attributes are then used as st...




Publication date: 2016